arXiv:1502.03917v2 [math.NA] 7 Jan 2016 


Efficient smoothers for all-at-once multigrid 
methods for Poisson and Stokes control problems 


Stefan Takacs* 

Mathematical Institute, University of Oxford, United Kingdom 
Stefan.takacsSnuma.uni-linz.ac.at 
http://www.numa.uni-linz.ac.at/~stefant/J3362/ 


Abstract. In the present paper we concentrate on an important issue in 
constructing a good multigrid solver: the choice of an efficient smoother. 
We will introduce all-at-once multigrid solvers for optimal control prob¬ 
lems which show robust convergence in the grid size and in the regular¬ 
ization parameter. We will refer to recent publications that guarantee 
such a convergence behavior. These publications do not pay much at¬ 
tention to the construction of the smoother and suggest to use a normal 
equation smoother. We will see that using a Gauss Seidel like variant of 
this smoother, the overall multigrid solver is speeded up by a factor of 
about two with no additional work. The author will give a proof which 
indicates that also the Gauss Seidel like variant of the smoother is cov¬ 
ered by the convergence theory. Numerical experiments suggest that the 
proposed method are competitive with Vanka type methods. 

Keywords: PDE-constrained optimization, All-at-once multigrid. Gauss 
Seidel 


1 Introduction 

In the present paper we discuss the construction of the all-at-once multigrid 
solvers for two model problems. The first model problem is a standard Poisson 
control problem'. Find a state y £ and a control u £ Lp'{Q) such that 

they minimize the cost functional 

J{y,y) ■= \\\y-yD\\l^(Q) + 

subject to the elliptic boundary value problem (BVP) 

—Ay + y = urn fi and = 0 on dfl. 

The desired state yo and the regularization parameter a > 0 are assumed to 
be given. Here and in what follows, I? C is a polygonal domain. We want to 
solve the finite element discretization of this problem using a fast linear solver 
which shows robust convergence behavior in the grid size and the regularization 
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parameter. For solving this problem, we use the method of Lagrange multipliers, 
cf. |5I6) . We obtain a linear system in the state y, the control u and the Lagrange 
multiplier A. In this linear system we eliminate the control as this has been done 
in |6l9j . We discretize the resulting system using the Courant element and obtain 
a linear system: 


(M, K, \(yA^ 
Ak ■■= 31/c := 



( 1 ) 


Here, Mk and Kk are the standard mass and stiffness matrices, respectively. The 
control can be recovered using the following simple relation from the Lagrange 
multiplier: U). = cf. [5]. In ma it was shown that there are constants 

C > 0 and C > 0 (independent of the grid size hk and the choice of a) such that 
the stability estimate 


\Q-k^^^AkQk^^^\\ 


< C 


and 




< c 


-1 


( 2 ) 


holds for the symmetric and positive definite matrix 


Qk '■= 


^Mk+a^AKk 


a ^Mk + a 


The second model problem is a standard Stokes control problem (velocity 
tracking problem): Find a velocity filed v G [iL^(l7)]‘^, a pressure distribution 
p G L^(I7) and a control u G [L'^{n)Y such that 

J{v,p,u) = l\\v - VD\\l 2 ^f 2 ) + 
is minimized subject to the Stokes equations 


—Av + Vp = u in 17, V • u = 0 in 17, u = 0 on 9l7. 


The regularization parameter a > 0 and the desired state (desired velocity field) 
vd G [L^(17)]‘^ are assumed to be given. To enforce uniqueness of the solution, 
we additionally require p dx = 0. 

Similar as above, we can set up the optimality system and eliminate the 
control, cf. UMl. The discretization can be done using the Taylor-Hood element. 
After these steps, we end up with the following linear system: 


(Mk 

Kk Dl\ 

(yk\ 


(lk\ 

0 

Dk 

Ik 


0 

KkDl 

-a~^Mk 

A/c 


0 

\Dk 

0 / 

\l^k) 


\qJ 


Afe := ItLk ■■= := 


where Mk and Kk are standard mass and stiffness matrices and is the dis¬ 
cretization of the gradient operator, see, e.g., fTW| . Again, we are interested in 
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a fast solver which is robust in the regularization parameter and the grid size. As 
in the previous example, the control U/. can by recovered from the Lagrange mul¬ 
tiplier: Uf. = In m it was shown that stability estimate © is satisfied 

for 

Qfc =block-diag(VLfc, aDkW-^Dl, a-^Wk, DkW-^Dl) , 
where Wk '■= Mk + a^^'^Kk- 


2 An all-at-once multigrid method 


The linear systems Q and ^ shall be solved by a multigrid method, which 
reads as follows. Starting from an initial approximation one iterate of the 
multigrid method is given by the following two steps: 

— Smoothing procedure: Compute 

:= + Afc ^ (/^ - Ak ^k’'^~^^) for m = 1,..., 1 / 

with = 3^k'^■ The choice of the smoother (or, in other words, of the 

matrix will be discussed below. 

— Coarse-grid correction: 

• Compute the defect f^—Ak restrict it to grid level k—l using 

an restriction matrix := — Ak 3±k'"^^ ■ 

• Solve the following coarse-grid problem approximatively: 

=^ 1-1 ( 4 ) 


• Prolongate p^kli to the grid level k using an prolongation matrix Ik-i 
and add the result to the previous iterate: := x^k’'^'' + 

As we have assumed to have nested spaces, the intergrid-transfer matrices can be 
chosen in a canonical way: Ik-i is the canonical embedding and the restriction 
is its (properly scaled) transpose. If the problem ([4|) is solved exactly, we 
obtain the two-grid method. In practice, the problem W is approximatively 
solved by applying one step (V-cycle) or two steps (W-cycle) of the multigrid 
method, recursively. Only the coarsest grid level, Q is solved exactly. 

The only part of the multigrid algorithm that has not been specified yet, is 
the smoother. For the choice of the smoother, we make use of the convergence 
theory. We develop a convergence theory based on Hackbusch’s splitting of the 
analysis into smoothing property and approximation property: 

— Smoothing property: 


(A( 4 °’''^ -¥*k),Xk] 


sup 


< r]{n)\\x^°^ 


XkWc 


k 


XkWc 




( 5 ) 
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should hold for some function rj{y) with limi,_>oo = 0- Here and in what 
follows, xl := is the exact solution, || • ||£j^ := ■= 

for some symmetric positive definite matrix Ck and (•, ■)i 2 is the standard 
Euclidean scalar product. 

— Approximation property: 



^k\\Ck<C!A sup 




£2 


\\±k\ 


Ck 


should hold for some constant Ca > 0. 

It is easy to see that, if we combine both conditions, we see that the two-grid 
method converges in the norm || • ||£^ for v large enough. The convergence of the 
W-cycle multigrid method can be shown under mild assumptions, see e.g. [5]. 

For the smoothing analysis, it is convenient to rewrite the smoothing property 
in pure matrix notation: ([^ is equivalent to 

( 6 ) 

For the Poisson control problem, it was shown in [5], that the approximation 
property is satisfied for the following choice of the matrix Ck (note that this 
matrix represents the norm |j • |jx- used in the mentioned paper) 

diag(Mfc -I- a^l'^Kk) 

diag(a“iMfc -b a~^l'^Kk) 



i.e., = diag(Qfc). Here and in what follows, diag(M) is the diagonal matrix 

containing the diagonal of a matrix M. For the Stokes control problem it was 
shown in [7] , that the approximation property is satisfied for the following choice 
of Ck'- 

fWk \ 

r = 

a-^Wu 

\ a-^PkJ 

where W/- := diag(Mfe + (Al’^KjA) and '■= a (i\ag{DkW^^D'^). 

Still, we have not specified the choice of the smoother, which now can be 
done using the convergence theory. We have seen for which choices of Ck the 
approximation property is satisfied. We are interested in a smoother such that 
the smoothing property is satished for the same choice of Ck- 

In |9I7) a normal equation smoother was proposed. This approach is applica¬ 
ble to a quite general class of problems, cf. [5] and others. In our notation, the 
normal equation smoother reads as follows: 


.^( 0 ,™)_ AO-m-l) 


rCl^AlCl^ (/,-A 




for m = 1,... ,v. 
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Here, a fixed r > 0 has to be chosen such that the spectral radius p{TA^^Ak) 
is bounded away from 2 on all grid levels k and for all choices of the parame¬ 
ters. It was shown that it is possible to find such an uniform r for the Poisson 
control problem, e.g., in [5] and for the Stokes control problem, e.g., in [7]. For 
the normal equation smoother, the smoothing property can be shown using a 
simple eigenvalue analysis, cf. [2]. Numerical experiments show that the normal 
equation smoother works rather well for the mentioned model problems. How¬ 
ever, there are smoothers such that the overall multigrid method converges much 
faster. Note that the normal equation smoother is basically a Richardson itera¬ 
tion scheme, applied to the normal equation. It is well-known for elliptic problems 
that Gauss Seidel iteration schemes are typically much better smoothers than 
Richardson iteration schemes. In the context of saddle point problems, the idea 
of Gauss Seidel smoothers has been applied, e.g., in the context of collective 
smoothers, see below. However, in the context of normal equation smoothers the 
idea of Gauss Seidel smoothers has not gained much attention. The setup of such 
an approach is straight forward: In compact notation such an approach, which 
we call least squares Gauss Seidel (LSGS) approach, reads as follows: 


^ trig(A4)-M^/: 


(h 


- Ak X 


(O.m-l) 

^k 


for TO = 1,..., 


Ak := 


where A4 := A^C^j^^Ak and trig(M) is a matrix whose coefficients coincide 
with the coefficients of M on the diagonal and the left-lower triangular part and 
vanish elsewhere. The author provides a possible realization of that approach as 
Algorithm to convince the reader that the computational complexity of the 
LSGS approach is equal to the computational complexity of the normal equation 
smoother, where a possible realization is given as Algorithm 

We will see below that the LSGS approach works very well in the numerical 
experiments. However, there is no proof of the smoothing property known to 
the author. This is due to the fact that the matrix Ak is not symmetric. One 
possibility to overcome this difficulty is to consider the symmetric version (sym¬ 
metric least squares Gauss Seidel approach, sLSGS approach). This is analogous 
to the case of elliptic problems: For elliptic problems the smoothing property 
for the symmetric Gauss Seidel iteration can be shown for general cases but for 
the standard Gauss Seidel iteration the analysis is restricted to special cases, cf. 
Section 6.2.4 in [3]. 

One step of the sLSGS iteration consists of one step of the LSGS iteration, 
followed by one step of the LSGS iteration with reversed order of the variables. 
(So the computational complexity of one step of the sLSGS iteration is equal 
to the computational complexity of two steps of the standard LSGS iteration.) 
One step of the sLSGS iteration reads as follows in compact notation: 


:= +^r,-'AlCP (4 - x<”'"-«) 

where A4 := trig(A4) diag(A4)“^ trig(A4)^- 


for TO = 1,.. ., 


( 7 ) 


For our needs, the following convergence lemma is sufficient. 
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Given: Iterate (xi)^i = and corresp. residual (ri)^i = / — ; 

Result: Iterate (xi)^i = and corresp. residual {ri)fL-i = f — 

for i = 1,..., do 
q := 0; 

for all j such that Aij ^ 0 do q := q + Ai,j jCj^j * r^; 
p. ;= r *q/£i,i; 

end 

for i = 1,..., do 

Xi := Xi + p.; 

for all j such that Aj,i ^ 0 do := Xj — Aj,i * p^; 
end 

Algorithm 1: Normal equation iteration scheme 


Given: Iterate (xi)fLi = and corresp. residual (ri)Hi 

Result: Iterate (xi)^i = and corresp. residual (ri)^i = 

Prepare once: A/i,i for alH = 1,..., A; 

for i = 1,..., A do 

q := 0; 

for all j such that Aij ^ 0 do q q + Ai,j jCj^j * r^; 

p := q/A/'i.i; 

Xi := Xi + p; 

for all j such that Aj,i ^ 0 do Xj := Xj — Aj,i * p; 
end 


Algorithm 2: LSGS iteration scheme 


f-Ax^°’^^-, 


Lemma 1. Assume that Ak is sparse, ([^ is satisfied and let Ck be a positive 
definite diagonal matrix such that 


IQ, 


1/2., 


,1/2 


k Xk\\<\\^k^Xk\ 


for all X).. 


( 8 ) 


Then the sLSGS approach satisfies the smoothing property (§, *.e., 

2-^l'^CmxL[Akfl'^ 


Cf^l^Akil-Mk^Mkrn 




< 


where nnz(M) is the maximum number of non-zero entries per row of M. 

Note that is a standard inverse inequality, which is satisfied for both model 
problems, cf. mz]. Note moreover that this assumption also has to be satisfied 
to show the smoothing property for the normal equation smoother, cf. m- 
Proof of Lemma^ The combination of ([^ and (|^ yields \\Cj. AkT-k ^^^11 — 
C. Prop. 6.2.27 in [3] states that for any symmetric positive definite matrix A4 

holds, where A4 is as in Q. Using Vk '■= diag(A4), we obtain 

||£;V2_^1/2||2 ^^(^-l/2_^^^-l/2) < ||£-l/\rig(A/fc)I?,-'/"f 


( 9 ) 
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Let Ak = ^fk = Ck = and ip^i) := {j G N : 

Afij ^ 0}. We obtain using Gerschgorin’s theorem, the fact that the infinity 
norm is monotone in the matrix entries, and using the symmetry of A4 and A^ 
and Cauchy-Schwarz inequality: 


IIP 


- 1/2 


trig(A4)P, 


- 1 / 2 , 


< ||p-'/"trig(A4)Pfc'/"trig(A4)^P;'/1i/" < ||Pfc 


= max 


E (e#^ 

feGi/’id V"=i 



< max > 1 = nnz(A4) < nnzly^r) . 

i=l,....lV 

kGtp(i) 


- 1/2 


( 10 ) 


Further, we obtain 




= \\C 


- 1 / 2 ^ 1 / 2||2 


= \\C 


- 1/2 


VkC 


k^k 


-1/2|| 


N 


= max / „ „ 

i=i,...,Ar 2—/ 

1=1 




Af 


'3,J 
^2 


< nnz(^fc) max ——b/— = nnz(^fc)||£ i/2||2 ^ nnz(^fe) C . 


( 11 ) 


By combining ([To|) and (|Il|), we obtain 


11^ 


-^^^Akii-A^j^kTc-^^Y 

<\\C-^^\l-MkK^rAkC^^Ak{I-M^^Mkrc-^^^\\ 


which finishes the proof. □ 

We went to compare the numerical behavior of the LSGS approach with the 
behavior of a standard smoother. One class of standard smoothers for saddle 
point problems is the class of Vanka type smoothers, which has been originally 
introduced for Stokes problems, cf. m- Such smoothers have also gained interest 
for optimal control problems, see, e.g., mm- 

The idea of Vanka type smoothers is to compute updates in subspaces directly 
for the whole saddle point problem and to combine these updates is an additive 
or a multiplicative way to compute the next update. Here, the variables are 
not grouped based on the block-structure of Ak, but the grouping is done of 
based on the location of the corresponding degrees of freedom in the domain 17. 
The easiest of such ideas for the Poisson control problems is to do the grouping 
point-wise, which leads to the idea of point smoothing. Here, we group for each 
node Si of the discretization (each degree of freedom of the Gourant element) the 
value Pi of the state and the value Xi of the Lagrange multiplier and compute 
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an update in the corresponding subspace. The multiplicative variant of such a 
smoother is a collective Gauss Seidel (CGS) smoother: 






1) 




{i) 1 T,(i) - 




-1 


■ 




- Ai. 


,i-l) 


1 (0,m,0) 

where xj. ■= 


(0,m—1) 


matrix V, 


(d 


and := 


For each f = 1,..., Nk, the 
takes the value 1 on the positions (i, 1) and (i + Nk,2) 


and the value 0 elsewhere. For the Poisson control problem, we obtain 



AkV 


(i) 

k 


( M,,, \ 


where Mi^i and Ki^i are the entries of the matrices Mk and Kk- 

For the Stokes control problem, it is not reasonable to use exactly the same 
approach. This is basically due to the fact that the degrees of freedom for v and 
A are not located on the same positions as the degrees of freedom for p and p. 
However, we can introduce an approach based on patches: so, for each vertex of 
the triangulation, we consider subspaces that consist of the degrees of freedoms 
located on the vertex itself and the degrees of freedom located on all edges which 
have one end at the chosen vertex, cf. Fig. Note that here the subspaces are 



Fig. 1. Patches for the Vanka-type smoother applied to a Taylor Hood discretization. 
The dots are the degrees of freedom of v and A, the rectangles are the degrees of 
freedom of p and p 

much larger than the subspaces chosen in the case of the CGS approach for the 
Poisson control problem (which was just 2). This increases the computational 
cost of applying the method significantly. For Vanka type smoothers there are 
only a few convergence results known, cf. [T] for a Fourier Analysis and an 
analysis based compactness argument and [S] for a proof based on Hackbusch’s 
splitting of the analysis into smoothing property and approximation property 
which shows the convergence in case of a collective Richardson smoother. 

3 Numerical results 

In this section we give numerical results to illustrate quantitatively the conver¬ 
gence behavior of the proposed methods. The number of iterations was measured 
as follows: We start with a random initial guess and iterate until the relative error 
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in the norm || • ||£^ was reduced by a factor of 10“®. Without loss of generality, 
the right-hand side was chosen to be 0. For both model problems, the normal 
equation smoother, the LSGS smoother, the sLSGS smoother and a Vanka type 
smoother have been applied. For the smoothers 2 pre- and 2 post-smoothing 
steps have been applied. Only for the sLSGS smoother, just 1 pre- and 1 post¬ 
smoothing step has been applied. This is due to the fact that one step of the 
symmetric version is basically the same computational cost as two steps of the 
standard version. The normal equation smoother was damped with t = 0.4 
for the Poisson control problem and r = 0.35 for the Stokes control problem, 
cf. [917] . For the Gauss Seidel-like approaches, damping was not used. 

In Table[^ we give the results for the standard Poisson control problem. Here, 
we see that all smoothers lead to convergence rates that are well bounded for a 
wide range of hk and a. Compared to the normal equation smoother, the LSGS 
smoother leads to a speedup be a factor of about two without any additional 
work. The symmetric version (sLSGS) is a bit slower than the LSGS method. 
For the first model problem, the (popular) CGS method is significantly faster. 
However, for this method no convergence theory is known. 




Normal equation 

LSGS 


sLSGS 


GGS 


a 

= 

10° 

10’ 

- 610-12 

10° 

10’ 

-6 10-12 

10° 

10" 

-6 10-12 

10° 

10' 

-6 10-12 

k 

= 5 

26 

31 

28 

11 

9 

7 

14 

12 

14 

5 

5 

3 

k 

= 6 

27 

28 

29 

11 

11 

7 

14 

14 

13 

5 

5 

3 

k 

= 7 

27 

28 

31 

11 

11 

6 

14 

14 

12 

5 

5 

3 

k 

= 8 

27 

27 

25 

11 

11 

3 

14 

14 

7 

5 

5 

4 


Table 1. Number of iterations for the Poisson control model problem 


In Table we give the convergence results for the Stokes control problem. 
Also here we observe that the LSGS and the sLSGS approach lead to a speedup 
of a factor of about two compared to the normal equation smoother. Here, the 
Vanka type smoother shows slightly smaller iteration numbers than the LSGS 
approach. In terms of computational costs, the LSGS smoother seems to be 
much better than the patch-based Vanka type smoother because there relatively 
large subproblems have to be solved to compute the updates. This is different 
the case of the CGS smoother, where the subproblems are just 2-by-2 linear 
systems. Numerical experiments have shown that the undamped version of the 
patch-based Vanka type method does not lead to a convergent multigrid method. 
So, this smoother was damped with r = 0.4. Due to lack of convergence theory, 
the author cannot explain why this approach - although it is a multiplicative 
approach - needs damping. 

For completeness, the author wants to mention that for cases, where a (closed 
form of a) matrix Qk satisfying robustly is not known, the normal equation 
smoother does not show as good results as methods where such an information 
is not needed, like Vanka type methods. This was discussed in [5] for a bound- 
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Normal equation LSGS sLSGS Vanka type 


a 

= 

10° 

10’ 

-6 10-12 

10° 

10- 

-6 10-12 

10° 

10’ 

-6 10-12 

10° 

10’ 

-6 10-12 

k 

= 4 

31 

31 

60 

13 

12 

14 

17 

16 

22 

11 

10 

7 

k 

= 5 

32 

30 

55 

14 

13 

12 

18 

16 

19 

11 

10 

7 

k 

= 6 

32 

31 

44 

14 

13 

9 

18 

17 

12 

11 

11 

7 

k 

= 7 

32 

31 

37 

14 

14 

6 

18 

17 

9 

11 

11 

9 


Table 2. Number of iterations for the Stokes control model problem 


ary control problem, but it is also true for the linearization of optimal control 
problems with inequality constraints as discussed in [3] and others. The same is 
true for the Gauss Seidel like variants of the normal equation smoother. 

Concluding, we have observed that accelerating the idea of normal equation 
smoothing with a Gauss Seidel approach, leads to a speedup of a factor of about 
two without any further work. The fact that convergence theory is known for 
the sLSGS approach, helps also for the numerical practice (unlike the case of 
Vanka type smoothers). 
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